Randomise the list



Annotation

Name     Chromosome   Copies   Start   Stop
CYP2D6   3            1        56      1065
ACO1     5            1        7865    8763
HIA1     12           1        12      2000
ABCA6    X            1        5748    6003


↓

Annotation

    Name     Chromosome   Copies   Start   Stop
1   CYP2D6   3            1        56      1065
2   ACO1     5            1        7865    8763
3   HIA1     12           1        12      2000
4   ABCA6    X            1        5748    6003



Separate the datasets of 
gene names and gene 
annotation information. 



Gene names

1   CYP2D6
2   ACO1
3   HIA1
4   ABCA6



3, 1, 56, 1065 
5, 1, 7865, 8763 
12, 1, 12, 2000 
X, 1, 5748, 6003 



Figure 1 
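
The split shown in Figure 1 can be sketched in Python as follows. This is a minimal illustration only: the function name `split_annotation`, the tuple layout and the use of the row number as the key linking the two datasets are assumptions made for the example, not details stated in the figure.

```python
# Hypothetical sketch of the Figure 1 split; names and layout are assumptions.
# Rows are the annotation table as read from the figure.
ROWS = [
    ("CYP2D6", "3", 1, 56, 1065),
    ("ACO1", "5", 1, 7865, 8763),
    ("HIA1", "12", 1, 12, 2000),
    ("ABCA6", "X", 1, 5748, 6003),
]


def split_annotation(rows):
    """Number the rows, then separate gene names from annotation values."""
    names = {}
    annotation = {}
    for number, (name, *values) in enumerate(rows, start=1):
        names[number] = name                 # dataset of gene names
        annotation[number] = tuple(values)   # chromosome, copies, start, stop
    return names, annotation


names, annotation = split_annotation(ROWS)
print(names[3])       # HIA1
print(annotation[3])  # ('12', 1, 12, 2000)
```

Keeping the row number in both datasets is what would allow a name to be matched to its annotation again later; neither dataset alone links a gene name to its coordinates.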
Sequence:



Read n characters 



n=2 

ac aa ac ca ca 



Convert the string of
characters to a symbol by
using a lookup table.




Lookup: aa = w, cc = y,
ac = x, ca = z



ac aa ac ca ca = x w x z z



x3 z5 z4 x1 w2




Dataset 1    Dataset 2



Figure 2 
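
The steps of Figure 2 can be sketched in Python as follows. This is a hedged illustration, not the method as specified: the function name `encode`, the optional `rng` parameter, and the choice to put the position numbers in one dataset and the symbols in the other are assumptions. The sketch also assumes the sequence length is a multiple of n and that every n-character chunk appears in the lookup table.

```python
import random

# Lookup table taken from Figure 2.
LOOKUP = {"aa": "w", "cc": "y", "ac": "x", "ca": "z"}


def encode(sequence, n=2, rng=None):
    """Chunk the sequence, map chunks to symbols, number them and shuffle."""
    if rng is None:
        rng = random.Random()
    # Read n characters at a time: "acaaaccaca" -> ac aa ac ca ca
    chunks = [sequence[i:i + n] for i in range(0, len(sequence), n)]
    # Convert each chunk to a symbol: x w x z z
    symbols = [LOOKUP[chunk] for chunk in chunks]
    # Number the symbols from 1, then randomise their order.
    numbered = list(enumerate(symbols, start=1))
    rng.shuffle(numbered)
    # Assumed split: position numbers in one dataset, symbols in the other.
    positions = [position for position, _ in numbered]
    shuffled = "".join(symbol for _, symbol in numbered)
    return positions, shuffled


positions, shuffled = encode("acaaaccaca", n=2)
print(positions, shuffled)  # order varies from run to run
```

Passing a seeded `random.Random` instance as `rng` makes the shuffle reproducible, which is convenient for testing but would not be appropriate where the ordering is meant to be unpredictable.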



Identify gene of interest

↓

Identify annotation for gene of interest

↓

Determine sequence fragment identity and retrieve datasets

↓

Combine datasets

↓

Unrandomise the sequence of numbered symbols

↓

Convert the string of symbols to characters by using a lookup table.

↓

Identify sequences of interest (e.g. start and end of gene) using annotation.
B

HIA1

↓

Name   Chromosome   Copies   Seq.FragID.   Start   Stop
HIA1   12           1        94            10      1998



↓

Dataset 94a    Dataset 94b
3 5 4 1 2      x z z x w



↓

x3 z5 z4 x1 w2

x1 w2 x3 z4 z5



↓

Lookup: w = aa, y = cc,
x = ac, z = ca

xwxzz = acaaaccaca



Sequence     Start   Stop
acaaaccaca   10      1998



Figure 3
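
The retrieval steps (combine the datasets, unrandomise, convert symbols back to characters) can be sketched in Python using the values shown for sequence fragment 94. This is a minimal sketch under assumptions: the function name `decode` is hypothetical, and it is assumed that the first dataset holds the position numbers and the second holds the symbols in the same order.

```python
# Reverse lookup table taken from the figure.
REVERSE_LOOKUP = {"w": "aa", "y": "cc", "x": "ac", "z": "ca"}


def decode(positions, symbols):
    """Pair symbols with positions, sort by position, map back to characters."""
    if len(positions) != len(symbols):
        raise ValueError("datasets must have the same length")
    # Combine datasets: (3, 'x'), (5, 'z'), (4, 'z'), (1, 'x'), (2, 'w')
    combined = zip(positions, symbols)
    # Unrandomise: sorting by position number restores the original order.
    ordered = sorted(combined)
    # Convert each symbol back to its characters.
    return "".join(REVERSE_LOOKUP[symbol] for _, symbol in ordered)


print(decode([3, 5, 4, 1, 2], "xzzxw"))  # acaaaccaca
```

The start and stop values from the annotation would then be applied to the recovered sequence to locate the region of interest; that step is omitted here because the example sequence is much shorter than the coordinates shown.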



